An Empirical Study on Compositionality in Compound Nouns
Abstract
A multiword expression is compositional if its meaning can be expressed in terms of the meanings of its constituents. In this paper, we collect and analyse compositionality judgments for a range of compound nouns using Mechanical Turk. Unlike existing compositionality datasets, our dataset has judgments on the contribution of each constituent word as well as judgments for the phrase as a whole. We use this dataset to study the relation between judgments at the constituent level and those for the whole phrase. We then evaluate two types of distributional models for compositionality detection: constituent-based models and composition-function-based models. Both types perform competitively, though the composition-function-based models perform slightly better. In both types, additive models perform better than their multiplicative counterparts.
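The two model families named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulations, so everything below is an assumption: the function names, the use of cosine similarity, and the toy vectors are hypothetical and not taken from the paper. A constituent-based score combines the similarity of the compound vector to each constituent vector; a composition-function-based score first composes the constituent vectors and then compares the result to the compound vector.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def compose_additive(w1, w2, a=1.0, b=1.0):
    """Additive composition: weighted element-wise sum (weights are assumptions)."""
    return [a * x + b * y for x, y in zip(w1, w2)]

def compose_multiplicative(w1, w2):
    """Multiplicative composition: element-wise product."""
    return [x * y for x, y in zip(w1, w2)]

def constituent_score(compound, w1, w2, combine="add"):
    """Constituent-based score: combine the compound's similarity to each constituent."""
    s1 = cosine(compound, w1)
    s2 = cosine(compound, w2)
    return s1 + s2 if combine == "add" else s1 * s2

def composition_score(compound, w1, w2, compose=compose_additive):
    """Composition-function-based score: compare the compound to the composed vector."""
    return cosine(compound, compose(w1, w2))

# Toy co-occurrence vectors (hypothetical, for illustration only).
compound = [1.0, 1.0]
first = [1.0, 0.0]
second = [0.0, 1.0]

additive = composition_score(compound, first, second, compose_additive)
multiplicative = composition_score(compound, first, second, compose_multiplicative)
```

In this toy case the two constituents share no dimensions, so the element-wise product is the zero vector and the multiplicative score falls to 0.0, while the additive score stays close to 1. This shows only how the two composition functions behave on sparse, non-overlapping vectors; it is not a reproduction of the paper's results.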
Related resources
An Analysis of Persian Compound Nouns as Constructions
In Construction Morphology (CM), a compound is treated as a construction at the word level with a systematic correlation between its form and meaning, in the sense that any change in the form is accompanied by a change in the meaning. Compound words are coined by compounding templates which are called abstract schemas in CM. These abstract constructional schemas generalize over sets of existing...
Exploring Vector Space Models to Predict the Compositionality of German Noun-Noun Compounds
This paper explores two hypotheses regarding vector space models that predict the compositionality of German noun-noun compounds: (1) Against our intuition, we demonstrate that window-based rather than syntax-based distributional features perform better predictions, and that not adjectives or verbs but nouns represent the most salient part-of-speech. Our overall best result is state-of-the-art,...
Association Norms of German Noun Compounds
This paper introduces association norms of German noun compounds as a lexical-semantic resource for cognitive and computational linguistics research on compositionality. Based on an existing database of German noun compounds, we collected human associations to the compounds and their constituents within a web experiment. The current study describes the collection process and a part-of-speech an...
Multiword Expressions in Child Language
The goal of this work is to introduce CHILDES-MWE, which contains English CHILDES corpora automatically annotated with Multiword Expressions (MWEs) information. The result is a resource with almost 350,000 sentences annotated with more than 70,000 distinct MWEs of various types from both longitudinal and latitudinal corpora. This resource can be used for large scale language acquisition studies...
Reverse-engineering Language: A Study on the Semantic Compositionality of German Compounds
In this paper we analyze the performance of different composition models on a large dataset of German compound nouns. Given a vector space model for the German language, we try to reconstruct the observed representation (the corpus-estimated vector) of a compound by composing the observed representations of its two immediate constituents. We explore the composition models proposed in the literat...